Exploring Vector Space Models to Predict the Compositionality of German Noun-Noun Compounds
Abstract
This paper explores two hypotheses regarding vector space models that predict the compositionality of German noun-noun compounds: (1) Against our intuition, we demonstrate that window-based rather than syntax-based distributional features yield better predictions, and that nouns, rather than adjectives or verbs, represent the most salient part of speech. Our overall best result is state-of-the-art, reaching Spearman’s ρ = 0.65 with a word space model of nominal features from a 20-word window of a 1.5-billion-word web corpus. (2) While there are no significant differences in predicting compound–modifier vs. compound–head ratings on compositionality, we show that the modifier (rather than the head) properties predominantly influence the degree of compositionality of the compound.
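The pipeline the abstract describes can be illustrated with a minimal sketch: count context words in a fixed window around each target, compare a compound's vector with a constituent's vector, and correlate the resulting scores with human ratings using Spearman's ρ. Everything specific here is an assumption made for illustration, not taken from the paper: the function names, the use of raw co-occurrence counts, and cosine as the similarity measure. The sketch also skips the part-of-speech filtering that a nominal-features model would need.

```python
from collections import Counter
from math import sqrt


def window_vectors(tokens, targets, window=20):
    """Count context words within +/- `window` tokens of each target word.

    Assumption: raw counts, no POS filtering and no association weighting.
    """
    vectors = {t: Counter() for t in targets}
    for i, tok in enumerate(tokens):
        if tok in vectors:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[tok][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx and sy else 0.0
```

In use, one would score each compound against its modifier (or head) with `cosine(vectors[compound], vectors[modifier])`, collect those scores in one list and the corresponding human compositionality ratings in another, and pass both to `spearman`.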
Similar resources
The Role of Modifier and Head Properties in Predicting the Compositionality of English and German Noun-Noun Compounds: A Vector-Space Perspective
In this paper, we explore the role of constituent properties in English and German noun-noun compounds (corpus frequencies of the compounds and their constituents; productivity and ambiguity of the constituents; and semantic relations between the constituents), when predicting the degrees of compositionality of the compounds within a vector space model. The results demonstrate that the empirica...
Using Distributional Similarity of Multi-way Translations to Predict Multiword Expression Compositionality
We predict the compositionality of multiword expressions using distributional similarity between each component word and the overall expression, based on translations into multiple languages. We evaluate the method over English noun compounds, English verb particle constructions and German noun compounds. We show that the estimation of compositionality is improved when using translations into m...
The (Un)expected Effects of Applying Standard Cleansing Models to Human Ratings on Compositionality
Human ratings are an important source for evaluating computational models that predict compositionality, but like many data sets of human semantic judgements, are often fraught with uncertainty and noise. However, despite their importance, to our knowledge there has been no extensive look at the effects of cleansing methods on human rating data. This paper assesses two standard cleansing approa...
Association Norms of German Noun Compounds
This paper introduces association norms of German noun compounds as a lexical-semantic resource for cognitive and computational linguistics research on compositionality. Based on an existing database of German noun compounds, we collected human associations to the compounds and their constituents within a web experiment. The current study describes the collection process and a part-of-speech an...
GhoSt-NN: A Representative Gold Standard of German Noun-Noun Compounds
Interest in systematically exploring factors that have been found to influence the cognitive processing and representation of compounds, such as • frequency-based factors, i.e., the frequencies of the compounds and their constituents (e.g., van Jaarsveld & Rattink, 1988; Janssen et al., 2008); • the productivity (morphological family size), i.e., the number of compounds that share a constituent...